5 research outputs found

    An in-depth look at prior art in fast round-robin arbiter circuits

    Get PDF
    Arbiters are found where shared resources exist such as busses, switching fabrics, processing elements. Round-robin is a fair arbitration method, where requestors get near-equal shares of a common resource or service. Round-robin arbitration (RRA) finds use in network switches/routers and processor boards/systems as well as many other applications that have concurrency. Today's electronic systems require arbiters with hundreds of ports (e.g., switching fabrics with virtual I/O queues) and clock speeds near the limits of even the latest microelectronics fabrication processes/libraries. Achieving high clock speeds in the presence of large number of ports is only possible with highly parallel arbiter architectures. This paper presents an in-depth literature survey of previous work on this problem. It looks at RRA work in the literature in a bigger context, then defines the typical RRA problem (RRA_typical), and specifically investigates work on fast architectures that solve the RRA_typical problem. There are five such works that are really competitive. This report takes a very in-depth look at these works. It explains each architecture and how/why it works from a unique perspective that cannot be found in the original publication of that architecture. It also proposes improvements to these architectures. We wrote generators for the improved versions of these architectures. We will share a summary of synthesis results in this report – although a detailed account of how these results were obtained and their analysis is the subject of another (upcoming) publicatio

    Fast parallel prefix logic circuits for n2n round-robin arbitration

    No full text
    Due to copyright restrictions, the access to the full text of this article is only available via subscription.An n2n round-robin arbiter (RRA) searches its n inputs for a 1, starting from the highest-priority input. It picks the first 1 and outputs its index in one-hot encoding. RRA aims to be fair to its inputs and maintains fairness by simply rotating the input priorities, i.e., the last arbitrated input becomes the lowest-priority input. Arbiters are used to multiplex the usage of shared resources among requestors as well as in dispatch logic where the purpose is load balancing among multiple resources. Today, arbiters have hundreds of ports and usually need to run at very high clock speeds. This article presents a new gate-level RRA circuit called Thermo Coded-Parallel Prefix Arbiter (TC-PPA) that scales to any number of requestors. It uses parallel prefix network topologies (borrowed from fast carry lookahead adders) to generate a thermometer-coded pointer, thus greatly reducing critical path. Code generators were written not only for TC-PPA but also for the 5 highly competitive circuits in the literature (9 including their variants), and a rich set of timing/area results were obtained using a standard-cell based logic synthesis flow with a novel iterative strategy based on binary search. Synthesis runs include results with wire-load and without. Results show that for 54 or more ports (except 256) TC-PPA offers the best timing (lowest latency) as well as competitive area. Contributions also include transaction-level simulations that show when pipelining is used to boost clock rate, latency and input FIFO sizes are adversely affected, and hence pipelining cannot be indiscriminately exploited to trim clock period

    Fast parallel prefix logic circuits for n2n round-robin arbitration

    No full text
    Due to copyright restrictions, the access to the full text of this article is only available via subscription.An n2n round-robin arbiter (RRA) searches its n inputs for a 1, starting from the highest-priority input. It picks the first 1 and outputs its index in one-hot encoding. RRA aims to be fair to its inputs and maintains fairness by simply rotating the input priorities, i.e., the last arbitrated input becomes the lowest-priority input. Arbiters are used to multiplex the usage of shared resources among requestors as well as in dispatch logic where the purpose is load balancing among multiple resources. Today, arbiters have hundreds of ports and usually need to run at very high clock speeds. This article presents a new gate-level RRA circuit called Thermo Coded-Parallel Prefix Arbiter (TC-PPA) that scales to any number of requestors. It uses parallel prefix network topologies (borrowed from fast carry lookahead adders) to generate a thermometer-coded pointer, thus greatly reducing critical path. Code generators were written not only for TC-PPA but also for the 5 highly competitive circuits in the literature (9 including their variants), and a rich set of timing/area results were obtained using a standard-cell based logic synthesis flow with a novel iterative strategy based on binary search. Synthesis runs include results with wire-load and without. Results show that for 54 or more ports (except 256) TC-PPA offers the best timing (lowest latency) as well as competitive area. Contributions also include transaction-level simulations that show when pipelining is used to boost clock rate, latency and input FIFO sizes are adversely affected, and hence pipelining cannot be indiscriminately exploited to trim clock period

    Fast two-pick n2n round-robin arbiter circuit

    No full text
    Due to copyright restrictions, the access to the full text of this article is only available via subscription.A regular (one-pick) round-robin arbiter circuit picks one active requester (if any) out of n requesters. A two-pick round-robin arbiter selects up to two requesters. An n2n two-pick round-robin arbiter indicates the picked requests with (at most) two-hot n-bit output. A round-robin arbiter is fair to its requesters and does this by repeatedly moving its highest priority pointer to the position immediately next to the second requester picked. Presented is the circuit architecture and VLSI implementation of a new scalable two-pick round-robin arbiter with low latency, which is compared with previous work based on logic synthesis results

    Fast two-pick n2n round-robin arbiter circuit

    No full text
    Due to copyright restrictions, the access to the full text of this article is only available via subscription.A regular (one-pick) round-robin arbiter circuit picks one active requester (if any) out of n requesters. A two-pick round-robin arbiter selects up to two requesters. An n2n two-pick round-robin arbiter indicates the picked requests with (at most) two-hot n-bit output. A round-robin arbiter is fair to its requesters and does this by repeatedly moving its highest priority pointer to the position immediately next to the second requester picked. Presented is the circuit architecture and VLSI implementation of a new scalable two-pick round-robin arbiter with low latency, which is compared with previous work based on logic synthesis results
    corecore